Openspeech Transducer Model
-
class openspeech.models.openspeech_transducer_model.OpenspeechTransducerModel(configs: omegaconf.dictconfig.DictConfig, tokenizer: openspeech.tokenizers.tokenizer.Tokenizer)
Base class for OpenSpeech’s transducer models.
- Parameters
configs (DictConfig) – Configuration set.
tokenizer (Tokenizer) – Tokenizer in charge of preparing the inputs for the model.
- Inputs:
inputs (torch.FloatTensor): An input sequence passed to the encoder. Typically a padded FloatTensor of size (batch, seq_length, dimension).
input_lengths (torch.LongTensor): The length of each input sequence, of size (batch).
- Returns
Result of model predictions that contains predictions, logits, encoder_outputs, encoder_output_lengths
- Return type
-
forward(inputs: torch.Tensor, input_lengths: torch.Tensor) → Dict[str, torch.Tensor]
Encode inputs and decode the resulting encoder_outputs.
- Parameters
inputs (torch.FloatTensor) – An input sequence passed to the encoder. Typically a padded FloatTensor of size (batch, seq_length, dimension).
input_lengths (torch.LongTensor) – The length of each input sequence, of size (batch).
- Returns
Result of model predictions that contains predictions, encoder_outputs, encoder_output_lengths
- Return type
Dict[str, torch.Tensor]
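The `inputs` and `input_lengths` arguments described above can be built from variable-length feature sequences. The following is a minimal sketch using `torch.nn.utils.rnn.pad_sequence`. The feature dimension of 80 and the frame counts are made-up values for illustration, and the call on a model instance appears only as a comment, because constructing a model depends on your configuration and tokenizer.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Three hypothetical utterances with different numbers of frames.
# Each frame is an 80-dimensional feature vector (assumed value).
features = [torch.randn(120, 80), torch.randn(95, 80), torch.randn(60, 80)]

# Lengths before padding, shape (batch).
input_lengths = torch.tensor([f.size(0) for f in features], dtype=torch.long)

# Zero-pad to the longest sequence, shape (batch, seq_length, dimension).
inputs = pad_sequence(features, batch_first=True)

print(tuple(inputs.shape))     # (3, 120, 80)
print(input_lengths.tolist())  # [120, 95, 60]

# With an already constructed model (not shown here), the documented call is:
# outputs = model.forward(inputs, input_lengths)
```

Keeping the true lengths alongside the padded tensor lets the encoder ignore the padded frames.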
-
greedy_decode(encoder_outputs: torch.Tensor, max_length: int) → torch.Tensor
Decode encoder_outputs.
- Parameters
encoder_outputs (torch.FloatTensor) – An output sequence of the encoder. FloatTensor of size (batch, seq_length, dimension).
max_length (int) – Maximum number of decoding time steps.
- Returns
Log probability of model predictions.
- Return type
logits (torch.FloatTensor)
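As an illustration of greedy decoding in a transducer, here is a minimal sketch for a single utterance. It is not the OpenSpeech implementation: `TinyTransducer`, `decode_step`, `BLANK_ID`, `SOS_ID`, and all sizes are hypothetical names and values chosen for this example. As a simplification, it emits at most one token per encoder frame and returns the emitted token indices rather than log probabilities.

```python
import torch
import torch.nn as nn

BLANK_ID = 0  # assumed blank index for this sketch
SOS_ID = 1    # assumed start-of-sequence index for this sketch


class TinyTransducer(nn.Module):
    """Toy prediction network plus joint layer, for illustration only."""

    def __init__(self, num_classes: int, enc_dim: int, dec_dim: int) -> None:
        super().__init__()
        self.embedding = nn.Embedding(num_classes, dec_dim)
        self.rnn = nn.GRU(dec_dim, dec_dim, batch_first=True)
        self.fc = nn.Linear(enc_dim + dec_dim, num_classes)

    def decode_step(self, token, hidden):
        # token: (1, 1) -> decoder output (1, 1, dec_dim) and new hidden state.
        return self.rnn(self.embedding(token), hidden)

    def joint(self, enc_frame, dec_out):
        # Both inputs are 1-D vectors; returns log probabilities over classes.
        joined = torch.cat((enc_frame, dec_out), dim=-1)
        return self.fc(joined).log_softmax(dim=-1)

    @torch.no_grad()
    def greedy_decode(self, encoder_output: torch.Tensor, max_length: int) -> torch.Tensor:
        # encoder_output: (seq_length, enc_dim) for a single utterance.
        token = torch.tensor([[SOS_ID]], dtype=torch.long)
        dec_out, hidden = self.decode_step(token, None)
        predictions = []
        for t in range(min(max_length, encoder_output.size(0))):
            log_probs = self.joint(encoder_output[t], dec_out.view(-1))
            best = int(log_probs.argmax())
            if best != BLANK_ID:
                # Emit the token and advance the prediction network.
                predictions.append(best)
                token = torch.tensor([[best]], dtype=torch.long)
                dec_out, hidden = self.decode_step(token, hidden)
        return torch.tensor(predictions, dtype=torch.long)


torch.manual_seed(0)
model = TinyTransducer(num_classes=6, enc_dim=4, dec_dim=3)
predictions = model.greedy_decode(torch.randn(7, 4), max_length=7)
print(predictions)  # at most 7 token ids, none equal to BLANK_ID
```

The prediction network only advances when a non-blank token is emitted, which is what distinguishes transducer decoding from frame-wise CTC decoding.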
-
joint(encoder_outputs: torch.Tensor, decoder_outputs: torch.Tensor) → torch.Tensor
Join encoder_outputs and decoder_outputs.
- Parameters
encoder_outputs (torch.FloatTensor) – An output sequence of the encoder. FloatTensor of size (batch, seq_length, dimension).
decoder_outputs (torch.FloatTensor) – An output sequence of the decoder. FloatTensor of size (batch, seq_length, dimension).
- Returns
Outputs of joining encoder_outputs and decoder_outputs.
- Return type
outputs (torch.FloatTensor)
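One common way to join the two sequences in a transducer is to pair every encoder frame with every decoder step, concatenate the feature vectors, and project to the vocabulary. The sketch below shows that approach. `JointSketch`, its layer, and all sizes are assumptions for illustration, and the actual OpenSpeech method may differ in detail.

```python
import torch
import torch.nn as nn


class JointSketch(nn.Module):
    """Illustrative joint network; names and sizes are assumptions."""

    def __init__(self, encoder_dim: int, decoder_dim: int, num_classes: int) -> None:
        super().__init__()
        self.fc = nn.Linear(encoder_dim + decoder_dim, num_classes)

    def forward(self, encoder_outputs: torch.Tensor, decoder_outputs: torch.Tensor) -> torch.Tensor:
        input_length = encoder_outputs.size(1)   # T
        target_length = decoder_outputs.size(1)  # U
        # (batch, T, enc_dim) -> (batch, T, U, enc_dim)
        enc = encoder_outputs.unsqueeze(2).expand(-1, -1, target_length, -1)
        # (batch, U, dec_dim) -> (batch, T, U, dec_dim)
        dec = decoder_outputs.unsqueeze(1).expand(-1, input_length, -1, -1)
        # Concatenate along the feature axis and project to class scores.
        joined = torch.cat((enc, dec), dim=-1)
        return self.fc(joined).log_softmax(dim=-1)


joint = JointSketch(encoder_dim=16, decoder_dim=8, num_classes=10)
outputs = joint(torch.randn(2, 5, 16), torch.randn(2, 3, 8))
print(tuple(outputs.shape))  # (2, 5, 3, 10)
```

The resulting (batch, T, U, num_classes) lattice is the shape that transducer losses typically consume.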
-
set_beam_decode(beam_size: int = 3, expand_beam: float = 2.3, state_beam: float = 4.6)
Set up beam search decoding.
-
test_step(batch: tuple, batch_idx: int) → collections.OrderedDict
Forward propagate a pair of inputs and targets for testing.
- Inputs:
batch (tuple): A batch containing inputs, targets, input_lengths, target_lengths
batch_idx (int): The index of the batch
- Returns
loss for the test step
- Return type
loss (torch.Tensor)
-
training_step(batch: tuple, batch_idx: int) → collections.OrderedDict
Forward propagate a pair of inputs and targets for training.
- Inputs:
batch (tuple): A training batch containing inputs, targets, input_lengths, target_lengths
batch_idx (int): The index of the batch
- Returns
loss for training
- Return type
loss (torch.Tensor)
-
validation_step(batch: tuple, batch_idx: int) → collections.OrderedDict
Forward propagate a pair of inputs and targets for validation.
- Inputs:
batch (tuple): A batch containing inputs, targets, input_lengths, target_lengths
batch_idx (int): The index of the batch
- Returns
loss for the validation step
- Return type
loss (torch.Tensor)